Knockoff: Cheap Versions in the Cloud
Abstract
Cloud-based storage provides reliability and ease of management. Unfortunately, it can also incur significant costs for both storing and communicating data, even after using techniques such as chunk-based deduplication and delta compression. The current trend of providing access to past versions of data exacerbates both costs. In this paper, we show that deterministic recomputation of data can substantially reduce the cost of cloud storage. Borrowing a well-known dualism from the fault-tolerance community, we note that any data can be equivalently represented by a log of the nondeterministic inputs needed to produce that data. We design a file system, called Knockoff, that selectively substitutes nondeterministic inputs for file data to reduce communication and storage costs. Knockoff compresses both data and computation logs: it uses chunk-based deduplication for file data and delta compression for logs of nondeterminism. In two studies, Knockoff reduces the average cost of sending files to the cloud without versioning by 21% and 24%; the relative benefit increases as versions are retained more frequently.
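The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not Knockoff's implementation: the function names, the fixed 4 KiB chunk size, and the use of zlib as a stand-in for delta compression of logs are all assumptions made for illustration, and the sketch ignores the cost of recomputing a file from its log.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size; real systems often use content-defined chunks


def dedup_cost(data: bytes, seen: set) -> int:
    """Bytes to send for `data` after chunk-based deduplication.

    `seen` holds SHA-256 digests of chunks the server already stores.
    A known chunk costs only its digest; an unknown chunk costs its full length.
    """
    cost = 0
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        cost += len(digest) if digest in seen else len(chunk)
    return cost


def choose_representation(data: bytes, log: bytes, seen: set):
    """Pick the cheaper of (deduplicated file data) and (compressed input log).

    zlib here is only a stand-in for the delta compression of
    nondeterminism logs mentioned in the abstract.
    """
    data_cost = dedup_cost(data, seen)
    log_cost = len(zlib.compress(log))
    if log_cost < data_cost:
        return ("log", log_cost)
    return ("data", data_cost)
```

For example, an 8 KiB file produced by a computation whose only nondeterministic input is a short seed would be sent as a log, while a 10-byte file with a large log would be sent as data.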
Similar resources
A risk model for cloud processes
Traditionally, risk assessment consists of evaluating the probability of "feared events", corresponding to known threats and attacks, as well as these events' severity, corresponding to their impact on one or more stakeholders. Assessing risks of cloud-based processes is particularly difficult due to lack of historical data on attacks, which has prevented frequency-based identification...
A Pseudo Knockoff Filter for Correlated Features
In 2015, Barber and Candès introduced a new variable selection procedure called the knockoff filter to control the false discovery rate (FDR) and proved that this method achieves exact FDR control. Inspired by the work of Barber and Candès (2015), we propose and analyze a pseudo knockoff filter that inherits some advantages of the original knockoff filter and has more flexibility in constructing ...
Some Analysis of the Knockoff Filter and its Variants
In many applications, we need to study a linear regression model that consists of a response variable and a large number of potential explanatory variables and determine which variables are truly associated with the response. In 2015, Barber and Candès introduced a new variable selection procedure called the knockoff filter to control the false discovery rate (FDR) and proved that this method a...
The knockoff filter for FDR control in group-sparse and multitask regression
We propose the group knockoff filter, a method for false discovery rate control in a linear regression setting where the features are grouped, and we would like to select a set of relevant groups which have a nonzero effect on the response. By considering the set of true and false discoveries at the group level, this method gains power relative to sparse regression methods. We also apply our me...
Cloud manufacturing system
Cloud manufacturing is defined as a relationship between the consumer and a flexible array of production services, managed by an intervening architecture that can match service providers to product and manufacturing processes. Cloud manufacturing definitions typically make explicit or imply three groups of actors: consumers, who request and use cloud manufacturing processes; application provider...
Publication date: 2017